Methods
RNA preparation 

Double-stranded DNA templates for the RNAs of interest were constructed using PCR
assembly with primers purchased from IDT (Integrated DNA Technologies) (ref. 4). RNAs
were transcribed at 37 °C for 3 hours in 320 µL reactions containing 32 pmol of dsDNA
template, 100 mM Tris-HCl, pH 8.1, 200 mM MgCl2, 3.5 mM spermidine, 0.1% Triton
X-100, 40 mM DTT, 4% PEG 8000, 20 U T7 RNA Polymerase (New England Biolabs), 1
mM NTPs, and 0.5 mM 2'-NH2-2'-deoxy-ATP (TriLink BioTechnologies). After purifying
transcribed RNAs using RNA Clean-and-Concentrate columns (Zymo Research), RNAs
were 5'-end labeled using a 5'-End-Tag kit and fluorescein maleimide (Vector Labs),
then purified again using RNA Clean-and-Concentrate columns. The hydroxyl radical
source, isothiocyanobenzyl-EDTA chelating Fe(III) (ITCB-Fe(III)·EDTA) (Dojindo
Molecular Technologies, Inc.), was covalently attached to the 2'-NH2 groups on the
RNA backbone using a two-step process. First, to couple ITCB-EDTA to the RNA, the
RNA was incubated with 0.5 mg ITCB-EDTA in 0.4 M KPO4, pH 8.5, at 37 °C for 12-16
hours. Then, the coupling reactions were incubated with 67 mM FeCl3 at room
temperature for 15 minutes, after which 75-100 mM Na-EDTA, pH 8.0, was added to
chelate excess Fe(III). After purifying with RNA Clean-and-Concentrate columns to
remove excess reagents, the RNA was PAGE-purified using denaturing 8%
polyacrylamide gels. Bands were located by scanning with a Typhoon imager (GE) for
fluorescein fluorescence, and excised gel slices were immersed in RNase-free water in
nonstick tubes overnight at 4 °C to elute the RNA. RNAs were purified from the eluate
using RNA Clean-and-Concentrate columns and stored at -20 °C.

Fragmentation and library preparation 
Activation of radical source 

Before activating the radical source to produce spatially localized hydroxyl radicals, 3-4
pmol of folded RNA were prepared in 50 mM Na-HEPES, pH 8.0, and 10 mM
MgCl2, as follows. First the RNA was heated to 65 °C in HEPES buffer for 3 min, then
cooled to room temperature for 10 min; then MgCl2 was added and the RNA was heated
to 50 °C for 5 min, then cooled to room temperature for 10 min. [For P4-P6, HEPES
buffer and MgCl2 were added concurrently, and the RNA was incubated for 10 min at
room temperature. For ligand-bound glycine riboswitch samples, these incubation steps
included 10 mM glycine. For ligand-bound adenosylcobalamin (AdoCbl) riboswitch
samples, these incubation steps included 140 µM AdoCbl, and all steps until
post-fragmentation ethanol precipitation (below) were performed in the dark.] After folding,
the radical source was activated by adding a 100 mM sodium ascorbate stock to a final
concentration of 10 mM. A control reaction was also prepared with deionized water
added instead of ascorbate; this reaction was carried through all subsequent steps in
parallel. After 5 to 30 min of incubation at room temperature (10 min was standard),
100 mM thiourea was added to a final concentration of 9.1 mM. The RNA fragments
were ethanol precipitated as follows: First, 1 µL GlycoBlue (Life Technologies), 1/10
volume of 3 M sodium acetate, pH 5.2, and 3 volumes of 100% ethanol chilled on dry
ice were added to the reaction. Samples were spun down immediately in a tabletop
microcentrifuge at maximum speed for 20 min, washed with ice-cold 70% ethanol, then
spun down again at maximum speed for 10 min. The supernatant was removed by pipetting
and the pellets were allowed to dry.
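
The stock additions in this step follow from simple mixing arithmetic. The sketch below is illustrative only: the 18 µL sample volume and the function name `stock_volume` are assumptions made for the example and do not come from the protocol. It shows that a final concentration of 9.1 mM from a 100 mM stock corresponds to adding about one tenth of the reaction volume.

```python
def stock_volume(reaction_volume_ul, stock_mm, final_mm):
    """Volume of stock (in µL) to add so that the mixture ends at final_mm."""
    if not 0 < final_mm < stock_mm:
        raise ValueError("final concentration must be positive and below the stock")
    # Mass balance: stock_mm * v = final_mm * (V + v)  =>  v = V * final / (stock - final)
    return reaction_volume_ul * final_mm / (stock_mm - final_mm)


# Hypothetical 18 µL folded-RNA sample (this volume is an assumption, not from the text).
sample_ul = 18.0
ascorbate_ul = stock_volume(sample_ul, stock_mm=100.0, final_mm=10.0)
thiourea_ul = stock_volume(sample_ul + ascorbate_ul, stock_mm=100.0, final_mm=9.1)
print(round(ascorbate_ul, 3), round(thiourea_ul, 3))
```

For this assumed volume, both additions come out close to 2 µL: one ninth of the sample volume for ascorbate and roughly one tenth of the quenched volume for thiourea.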

Repair of 3' ends 

To remove 3'-phosphates left by hydroxyl radical strand scission events in the RNA
backbone (refs. 19, 20), the purified RNA fragments were treated with T4 polynucleotide kinase
(T4 PNK) in conditions that promoted 3'-phosphatase activity (ref. 21). The end-repair
reactions contained 50 mM Na-MES, pH 6.0, 10 mM MgCl2, 5 mM DTT, 5 µM ATP, and
10 units T4 PNK (NEB) and were incubated at 37 °C for 30 min. After end-repair, RNAs
were ethanol precipitated as above, except with 0.5 µL GlycoBlue instead of 1 µL
GlycoBlue.

Ligation of DNA tail 

To prepare the fragmented RNA for reverse transcription, a preadenylated and 3'- 
blocked DNA tail (Universal miRNA cloning linker, NEB) was ligated to the 3'-end of the 
end-repaired fragments. Each ligation reaction contained 1x T4 RNA ligase buffer 
(NEB), 15% PEG 8000, 4-5 pmol DNA tail, and 200 U T4 RNA ligase 2 truncated, 
K227Q or KQ mutant (NEB). The reactions were incubated at 4 °C for 12 hours, 
followed by heat inactivation at 65 °C for 20 min. Ligated RNAs were purified from the 
reaction using RNA Clean-and-Concentrate columns, which also removed excess DNA
tail. 

Reverse transcription with sequencing primers 

After ligation of the DNA tail, RNAs were reverse transcribed with sequencing primers
containing, in 5' to 3' order, a 5'-fluorescein modification, the Illumina TruSeq Universal
adapter, 12-nucleotide barcodes (sequence-balanced in sets of 4 primers), and a primer
for the DNA tail sequence on the 3'-end (Supplementary Table 1). Three of the four
primers were used for reverse transcription of the ascorbate-treated sample and the
remaining primer was used for reverse transcription of the no-ascorbate control sample.
The 15 µL reverse transcription reactions contained 1x First Strand buffer (Life
Technologies), 5 mM DTT, 0.8 mM dNTPs, and 120 U SuperScript III (Life
Technologies) and were incubated at 55 °C for 30 min. To degrade the RNA templates,
5 µL of 0.4 M NaOH was added and the samples were incubated at 90 °C for 3 min.
After cooling on ice for 3 min, the cDNA samples were neutralized by addition of 1 µL of
an acid quench (2 mL 5 M NaCl, 2 mL 2 M HCl, and 3 mL 3 M Na-acetate) and then
purified by incubating with DynaBeads magnetic beads (Life Technologies) conjugated
to double-biotin-labeled ssDNA complementary to the TruSeq adapter.

Ligation of second sequencing adapter 

The second sequencing adapter (Supplementary Table 1), derived from the TruSeq
Indexed adapter with a 5'-phosphate (to enable ligation to the cDNA) and a
3'-phosphate (to block circularization), was ligated onto the 3'-ends of the cDNAs while
they were still annealed to DNA-coated DynaBeads. The four cDNA samples for each
RNA were pooled prior to the ligation reaction. The 50 µL ligation reactions contained
1.25 µM adapter, 1x CircLigase buffer (Epicentre), 50 µM ATP, 2.5 mM MnCl2, 4% PEG
1500, and 250 U CircLigase I (Epicentre) and were incubated at 68 °C for 2 hours,
followed by heat inactivation at 80 °C for 10 min. The samples were purified by
magnetic separation, and a fraction of the sample was run on a capillary electrophoresis
machine (Applied Biosystems) with co-loaded fluorescein-labeled standards and
analyzed using HiTRACE (ref. 8) to estimate the concentration of ligated cDNA.

Sequencing 

Sequencing using Illumina MiSeq or HiSeq instruments

The MOHCA-seq cDNA libraries were sequenced using 50-cycle MiSeq v2 kits on
Illumina MiSeq instruments or using an Illumina HiSeq 2500 instrument (Elim
Biopharmaceuticals). Beads harboring 25 fmol of single-stranded library fragments were
mixed with 2.5 fmol of PhiX dsDNA control (Illumina) and EB buffer (Qiagen) to 5 µL
total. Then 5 µL of 0.2 N NaOH was added and the fragments were eluted for 10 min at
room temperature. The supernatant of the magnetic beads was diluted into chilled HT1
buffer (10 µL added to 990 µL HT1), and then diluted again (375 µL added to 225 µL
HT1) before loading all 600 µL onto MiSeq kits following manufacturer instructions.
Paired-end sequencing involved 51 and 25 sequencing cycles for the first and second
reads, respectively. The resulting FASTQ data were processed with the MAPseeker
v1.2 software (https://github.com/DasLab/map_seeker), giving final text files in RDAT
format corresponding to the number of fragments observed for each pair of possible
cleavage and oxidation positions (corresponding to 3' and 5' positions of cDNA
fragments). These raw data have been deposited in the RNA Mapping Database
(http://rmdb.stanford.edu/), with the following accession IDs: P4-P6
(TRP4P6_MCA_0001-0004), glycine riboswitch with 10 mM glycine
(GLYCFN_MCA_0002) and with 0 mM glycine (GLYCFN_MCA_0003), AdoCbl
riboswitch with 140 µM AdoCbl (RNAPZ6_MCA_0002) and with 0 µM AdoCbl
(RNAPZ6_MCA_0003), and class I ligase (CL1LIG_MCA_0001-0003).
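
The elution and two-step dilution described above imply a nominal loading concentration, which can be checked with a few lines of arithmetic. This is a sketch under a stated idealization: it assumes that all 25 fmol of library elute from the beads into the 10 µL NaOH mixture, which the text does not claim, and the function name is hypothetical.

```python
def loading_concentration_pm(library_fmol=25.0, elution_ul=10.0,
                             dilutions=((10.0, 990.0), (375.0, 225.0))):
    """Nominal library concentration (pM) after elution and serial dilution.

    Assumes complete elution from the beads (an idealization).
    Each dilution is a (sample volume, diluent volume) pair in µL.
    """
    conc_pm = library_fmol / elution_ul * 1000.0  # fmol/µL equals nM; x1000 gives pM
    for sample_ul, diluent_ul in dilutions:
        conc_pm *= sample_ul / (sample_ul + diluent_ul)
    return conc_pm


final_pm = loading_concentration_pm()
print(f"{final_pm:.3f} pM in the 600 µL loaded")
```

Under this assumption the library is loaded at roughly 15.6 pM, with the PhiX control at one tenth of that amount.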

MOHCA-seq Data Analysis 

General MOHCA-seq analysis framework 

Analysis of MOHCA-seq data requires modeling backgrounds, modulation from reverse 
transcription attenuation, and sources of error. Here, we outline a general analysis 
framework for MOHCA-seq experiments, with the following two sections describing two 
independent statistical procedures developed to reach numerical solutions, which gave 
consistent proximity maps. The resulting proximity maps have been deposited in the 
RNA Mapping Database with the following accession IDs: P4-P6 
(TRP4P6_MCA_0000), glycine riboswitch with 10 mM glycine (GLYCFN_MCA_0000) 
and with 0 mM glycine (GLYCFN_MCA_0001), AdoCbl riboswitch with 140 µM AdoCbl
(RNAPZ6_MCA_0000) and with 0 µM AdoCbl (RNAPZ6_MCA_0001), and class I ligase
(CL1LIG_MCA_0000).

A MOHCA-seq product stemming from a radical cleavage event at nucleotide j and a
radical source at nucleotide i corresponds to a sequence i+1 to j-1 ligated between the
two inserts necessary for paired-end Illumina sequencing. The frequency $F_{ij}$ of such
products is related to the proximity of i and j but is modulated by the actual distribution
of radical sources, e.g., primarily at adenosines for 2'-NH2-dATP-incorporating
transcripts. The frequency is also suppressed by signal attenuation for long sequence
separations j-i due to the possibility of reverse transcription termination between i and
j. A master expression for these MOHCA-seq frequencies is:



with s indexing the possible source positions (0, 1, ..., N, with s = 0 corresponding to no
source) and $\epsilon(s)$ the fraction of transcripts containing a source at s. Values $p_i^s$ give the
probability that a reverse-transcription-stopping event occurs at the nucleotide
immediately 3' to i for a transcript with source at s, and values $q_j^s$ give the probability of
a cleavage event between j-1 and j for a transcript with source at s. This expression
simplifies in the limit of no background processes and low radical cleavage and
oxidation rates ($p_i^s, q_j^s \ll 1$, except at $p_i^{s=i} = 1$, corresponding to a reverse transcriptase
stop at the site of radical source attachment). In that limit, eq. (1) reduces to:




(1) 



(2) 
i.e. the observed frequencies provide a direct readout of the probability that a source at i
leads to radical cleavage at position j. This was the limit assumed in prior gel-based
MOHCA analysis based on reading out i through cleavage of phosphorothioate tags
associated with the radical sources. In the present MOHCA-seq protocol, we found
empirically that the reverse transcription readout of i led to a more complete portrait of
the proximity map, including information at nucleotides i at which radical sources were
not attached. Indeed, increasing the incorporation rate of radical sources or the time of
ascorbate-induced radical damage produced higher signal-to-noise data sets with clear
proximity map signals, despite bringing the analysis away from the regime of single-hit
incorporation and damage (Supplementary Figs. 3 and 5). The extra information was
derived from oxidative damage events that did not lead to backbone cleavage,
encapsulated in the term $p_i^s$ (compare Fig. 1b to ref. 13; these non-cleaving events
were previously invisible to gel-based analysis), but required a more advanced analysis
to extract the signal from the raw data.

In the general case, the number of observed frequencies $F_{ij}$ is on the order of N(N-1)/2,
whereas the total number of model parameters $\epsilon(s)$, $p_i^s$, and $q_j^s$ is substantially higher
(> $2N^2$), leading to an ill-posed problem. However, basic chemical considerations
reduce the number of parameters. First, we assume that the cleavage fraction $q_j^s$ is
composed of a uniform 'background' rate of cleavage $b_j$ that is independent of source s
(e.g., due to inline attack during cleavage) and additional cleavage due to hydroxyl
radical cleavage events (the desired MOHCA-seq signal), parameterized by $\pi_j^s$:

$$\text{Cleavage at } j \text{ from source } s = q_j^s = 1 - (1 - b_j)(1 - \pi_j^s) \approx b_j + \pi_j^s \tag{3}$$

Second, we assume that the stopping events $p_i^s$ are composed of a background rate $r_i$
of oxidative damage (e.g., due to solution radicals arising during ascorbate treatment)
and then additional oxidation due to hydroxyl radical damage events:

$$\text{Stop at } i \text{ from source } s = p_i^s = 1 - (1 - r_i)(1 - \rho_i \pi_i^s) \approx r_i + \rho_i \pi_i^s \tag{4}$$

In eqs. (3) and (4), a reduction in the number of parameters arises from assuming that
oxidative damage events producing reverse transcription stops occur at rates
proportional to backbone cleavage rates by a factor $\rho_i$. That is, $\pi_i^s$ parameterizes the
local effective concentration of radicals at i from source s, and the partitioning of these
radicals into events that lead to cleavage (contributing to $q_i^s$) versus total damage that
can terminate reverse transcription (contributing to $p_i^s$) is dependent on the chemical
environment of the site i and not on source s. The total number of parameters thus
reduces from greater than $2N^2$ to N(N-1)/2 for $\pi_j^s$ and 3N for $b_j$, $r_i$, and $\epsilon(s)$. By further
enforcing positivity of each of these parameters and assuming that $\pi_j^s$ is sparse, i.e.
that each nucleotide gives non-negligible cleavage at a number of residues smaller than
N, the number of parameters is reduced to well below the number of observables, and
the problem becomes well-posed. Requiring sparsity of the proximity map is similar to
assumptions used in solvent flattening and other density modification approaches in
crystallography (ref. 22).
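
The counting argument above can be made concrete for a given sequence length. The sketch below is illustrative: the exact index ranges (N+1 source positions including s = 0, and N nucleotide positions for each of p and q) are assumptions chosen to be consistent with the "> 2N^2" statement, and the function names are hypothetical.

```python
def observable_count(n):
    """Number of (i, j) pairs with i < j for an RNA of length n."""
    return n * (n - 1) // 2


def unreduced_parameter_count(n):
    """eps(s) for s = 0..n, plus p_i^s and q_j^s for every position and source."""
    sources = n + 1
    return sources + 2 * n * sources


def reduced_parameter_count(n):
    """N(N-1)/2 proximity terms plus 3N one-dimensional profiles (b, r, eps)."""
    return n * (n - 1) // 2 + 3 * n


n = 160  # example length, roughly the size of the P4-P6 domain
print(observable_count(n), unreduced_parameter_count(n), reduced_parameter_count(n))
```

For n = 160 the reduced model still has slightly more parameters (13,200) than observables (12,720), which illustrates why the positivity and sparsity assumptions are needed before the problem becomes well-posed.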

Determining proximity information from raw observables still requires solving a complex
system of non-linear equations. We found that direct least-squares optimization of the
thousands of variables $\pi_j^s$ to fit the observed $F_{ij}$, including Laplace priors to enforce
sparsity, required hours even with state-of-the-art numerical optimizers. We instead
developed two rapid, iterative strategies to carry out the solution (COHCOA and
LAHTTE, described next) and used differences between the results to evaluate
systematic errors in analysis assumptions.

Closure-based •OH Correlation Analysis (COHCOA)

A Closure-based •OH Correlation Analysis solves eq. (1) to determine a two-point
correlation function that is directly read out by MOHCA-seq. In the limit that the fraction 
damaged at any single nucleotide is smaller than one, 



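$$\begin{aligned}
F_{ij} &\approx \sum_s \epsilon(s)\, p_i^s\, q_j^s \Big[ 1 - \sum_{i<m<j} p_m^s \Big] \\
       &= \langle p_i q_j \rangle - \sum_{i<m<j} \langle p_i p_m q_j \rangle
\end{aligned} \tag{5}$$

where the bracket notation refers to a summation over sources:

$$\langle x \rangle = \sum_s \epsilon(s)\, x^s \tag{6}$$
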
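and terms beyond second order are neglected. The effects of the radical cleavage result
in a one-dimensional damage profile $R_i$ and cleavage profile $B_j$ and a two-point correlation
function $Q_{ij}$, which encodes the desired proximity map:

$$\begin{aligned}
B_j &= b_j + \langle \pi_j \rangle \\
R_i &= r_i + \rho_i \langle \pi_i \rangle \\
Q_{ij} &= \rho_i \left[ \langle \pi_i \pi_j \rangle - \langle \pi_i \rangle \langle \pi_j \rangle \right]
\end{aligned} \tag{7}$$

leading to the equation

$$F_{ij} \approx R_i B_j + Q_{ij} - \sum_{i<m<j} \left( R_i R_m B_j + R_m Q_{ij} + R_i Q_{mj} + \rho_m Q_{im} B_j \right) \tag{8}$$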

In the derivation of (8), we have dropped higher order terms corresponding to neglecting
higher cumulants of the damage function (e.g., the three-point cumulant
$J_{imj} = \langle p_i p_m q_j \rangle - \langle p_i p_m \rangle \langle q_j \rangle - \langle p_i q_j \rangle \langle p_m \rangle - \langle p_m q_j \rangle \langle p_i \rangle + 2 \langle p_i \rangle \langle p_m \rangle \langle q_j \rangle$) to match the lowest
order assumed in eq. (5). We also note that stops due to the possibility of more than
one radical source attached to the transcript (neglected in the derivation above) can be
modeled accurately, to first order, by including a rate $\epsilon(i)$ within the general 'background'
stopping rate $R_i$ (not shown). In general, all processes that lead to stops or cleavage
across all transcripts are subsumed into $R_i$ and $B_j$. As a corollary to this simplification,
however, this framework does not seek or enable deconvolution of the separate
contributions to each of these background rates.



In practice, eq. (8) can give unphysical negative values if the subtracted summand 
becomes large. Following a strategy used in, e.g., reference interaction site models for 
solvation, we 'close' the expansion by solving an equation system that is equal to (8) at 
lowest order, exact in certain limits, and guaranteed to give positive results: 



Intuitively, many of the observed products $F_{ij}$ are due to uncorrelated cleavage events $B_j$
and reverse transcription stop events $R_i$, which produce a 'plaid' background pattern. On
top of this background is the desired two-point correlation signal $Q_{ij}$, which is non-zero
only when nucleotides i and j have both been chemically modified by the same proximal
source. Modulating these signals is an attenuation factor $A_{ij}$, which parameterizes the
loss of signal, as a reverse transcriptase must polymerize from j back to i. This factor
depends on the general background stop rate $R_m$ but also includes two additional terms
representing the possibility for additional damage correlated with the observed cleavage
event at j and the observed stopping event at i. (For simplicity and based on separate
experiments, we fixed $\rho$, the ratio of chemical modification to backbone cleavage, as a
constant at 2.5; changing this value from 1 to 5 gave indistinguishable results.) We note
that the equation (9a,b) is exact in the case of negligible $Q_{ij}$; see references 23, 24.



$$F_{ij} = A_{ij} \left( R_i B_j + Q_{ij} \right)$$




(9a,b) 
The equations (9) are solved through iteration in a single script, cohcoa.m, available in
the MAPseeker package. A starting estimate of $R_i$ and the attenuation matrix $A_{ij}$ comes from
data corresponding to cleavages in the 3' flanking region, which is initially assumed to
not give specific contacts with the target RNA domain. One-dimensional 'background'
profiles $R_i$ and $B_j$ are determined which best fit the attenuation-corrected data $F_{ij}/A_{ij}$.
Subtracting the resulting background matrix $R_i B_j$ from the observed $F_{ij}$ results in an initial
solution for $Q_{ij}$. Any point of $Q_{ij}$ that is negative is reset to zero, and these $R_i$, $B_j$, and $Q_{ij}$
give an updated solution for the attenuation matrix $A_{ij}$. New estimates of $R_i$ and $B_j$ are
derived from fitting $F_{ij}/A_{ij} - Q_{ij}$ and the process is iterated until convergence. This
procedure does not require assuming symmetry of the two-point correlation function $Q_{ij}$,
but in the end returns an estimate of this matrix only for i < j, where there are data $F_{ij}$.
Propagating Poisson counting errors on $F_{ij}$ in eq. 9b gives standard errors on $A_{ij}$, and
combining these errors in quadrature with the errors on $F_{ij}$ in eq. 9a gives final error
estimates for the two-point correlation function $Q_{ij}$. Empirically, 20 or fewer iteration
cycles lead to convergence for all data sets tested; 40 iterations have been used in this
study to ensure convergence of final values within 1% (taking less than 1 minute on a
MacBook Pro 2.8 GHz Intel Core i7 running MATLAB 2012b). A comparison of raw
counts and COHCOA-analyzed data is shown in Supplementary Figure 1a-b.
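
The subtraction-and-clipping step of this iteration can be sketched in a few lines. The code below is not the cohcoa.m implementation: it is a minimal illustration that takes R, B and A as given, and it assumes a simplified attenuation matrix built only from the background stop rates (the two correlated-damage terms mentioned above are omitted). All function names are hypothetical.

```python
def attenuation(R):
    """A[i][j] = product over i < m < j of (1 - R[m]).

    Simplifying assumption: only background stops attenuate the signal.
    """
    n = len(R)
    A = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            A[i][j] = A[i][j - 1] * (1.0 - R[j - 1])
    return A


def correlation_update(F, R, B, A):
    """Q[i][j] = max(F[i][j] / A[i][j] - R[i] * B[j], 0) for i < j."""
    n = len(R)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            Q[i][j] = max(F[i][j] / A[i][j] - R[i] * B[j], 0.0)
    return Q


# Synthetic check: build F from known profiles and one true contact at (0, 3).
n = 5
R = [0.05] * n
B = [0.02] * n
Q_true = [[0.0] * n for _ in range(n)]
Q_true[0][3] = 0.01
A = attenuation(R)
F = [[A[i][j] * (R[i] * B[j] + Q_true[i][j]) if i < j else 0.0
      for j in range(n)] for i in range(n)]
Q_est = correlation_update(F, R, B, A)
```

In the full procedure this step alternates with refitting R and B to the attenuation-corrected, Q-subtracted data; the synthetic example only confirms that a known contact is recovered when the background profiles are exact.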

To visualize the data, we used a filter to remove points with a signal-to-noise ratio < 1 
and applied a 2D smoothing algorithm (Supplementary Figure 1c). The analyzed
proximity maps presented in the figures were primarily generated by COHCOA analysis,
with the exception of the comparison to LAHTTE analysis (described below) in 
Supplementary Figure 1d. 
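
The visualization step (signal-to-noise filter followed by 2D smoothing) can be sketched as follows. The smoothing kernel is not specified in the text, so the 3x3 moving average used here is an assumption, as are the function names.

```python
def snr_filter(values, errors, min_snr=1.0):
    """Zero out entries whose value/error ratio is below min_snr."""
    return [[v if e > 0 and v / e >= min_snr else 0.0
             for v, e in zip(value_row, error_row)]
            for value_row, error_row in zip(values, errors)]


def box_smooth(matrix):
    """3x3 moving average; edge cells average over the neighbors that exist."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            cells = [matrix[rr][cc]
                     for rr in range(max(r - 1, 0), min(r + 2, rows))
                     for cc in range(max(c - 1, 0), min(c + 2, cols))]
            out[r][c] = sum(cells) / len(cells)
    return out


values = [[4.0, 1.0], [0.5, 9.0]]
errors = [[1.0, 2.0], [1.0, 3.0]]
filtered = snr_filter(values, errors)
smoothed = box_smooth(filtered)
```

Filtering before smoothing keeps low-confidence points from being spread into neighboring cells of the map.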



Likelihood Analysis of Hydroxyl-probed TerTiary contact Estimation (LAHTTE) 

An alternative approach to obtain an estimate of the "proximity map" underlying the
MOHCA-seq data is to simplify eq. (1) by only inferring the fraction of sequence
fragments that are the result of source-induced cleavage at their 3'-end and of reverse
transcription stoppage due to the source itself at their 5'-end. This simplification drops
contributions to the contact map from sequence fragments that resulted from
reverse transcription stoppage occurring before the source position due to natural
reverse transcription drop-off or strand damage, and it is therefore an underestimate of the
true contact probabilities. In this framework, we can rewrite $F_{ij}$ as follows:



F.. = f ,.p/(l - R,)(\ - 5 ; .)fT(l - R r )(\ - - £,)(! - pf) + 

i'=i+\ 

R t (\ - B,)(l - e,)(l - pi)t[ (1 - - - e r )(l - pf) + 

i'=i+i 

R t B ..(1 - 5,0(1 - e,)(l - p/)fT(l - *,)(1 - 5,)(1 - e r )(l - pf) 

The first term corresponds to the event of interest: when the source at i is the cause of
reverse transcription stoppage. The second and third terms correspond to reverse
transcription stoppage due to background variables: in one event the 3'-end is cleaved
by a source other than $\epsilon_i$, and in the other the cleavage event is due to background
cleavage ($B_j$). The product terms are the result of excluding the possibility of reading
out cleavage events caused by sources that are downstream (3'), which cannot be
detected due to the nature of the MOHCA-seq protocol.

Letting $X_{ij}$ be the number of sequence fragments observed from position i to j, we can
write the likelihood of the data as:



$$L(p, B, R) = \prod_{ij} F_{ij}^{X_{ij}}$$

and the log-likelihood:

$$LL(p, B, R) = \sum_{ij} X_{ij} \log(F_{ij}) \tag{10}$$
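
As a small illustration of eq. (10), the log-likelihood can be evaluated directly from a count matrix and a matrix of model frequencies. This sketch assumes both are square lists of lists indexed by (i, j) with data only for i < j; the function name is hypothetical and the code is not taken from the LAHTTE implementation.

```python
import math


def log_likelihood(counts, freqs):
    """Sum of X_ij * log(F_ij) over i < j.

    Pairs with zero counts contribute nothing; a zero model frequency
    paired with a positive count raises ValueError from math.log.
    """
    total = 0.0
    n = len(counts)
    for i in range(n):
        for j in range(i + 1, n):
            if counts[i][j] > 0:
                total += counts[i][j] * math.log(freqs[i][j])
    return total


counts = [[0, 3, 1], [0, 0, 2], [0, 0, 0]]
freqs = [[0.0, 0.5, 0.25], [0.0, 0.0, 0.25], [0.0, 0.0, 0.0]]
ll = log_likelihood(counts, freqs)
```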



We can then find a maximum likelihood estimator for p by differentiating (10) and setting 
it to zero. For brevity, we define the following variables: 



i; =^.(1-^X1-5,.) 



( B \ 

1- 



K^s^-R^l-B^ 



Then, differentiating (10) with respect to p and setting to zero gives: 
pj = max



2A, 



/ J. 



V s 



(11) 



As in the general model for chemical mapping proposed previously (ref. 24), probabilities that
are estimated to be negative are set to zero.

Notice that in (11), the sum of the two background terms represents subtracting the background in
the two dimensions indexed by i and j. Eliminating either of the summands reduces (11) to
an equation similar to the solution for reactivity probabilities in chemical mapping. This
"two-dimensional background" is readily observed in the raw data and is the primary
structure exploited by the COHCOA analysis described above.



To calculate the variance of our maximum likelihood estimator, we use the second 
derivative of the log-likelihood with respect to p. Letting 

C v =pj£ i (l-R i )(l-B i )-(l-pj)r ij , we then have that: 



Var(pi) = 



1 

g ./w+i 
(i-W) 2 
All that remains is to obtain estimators for $\epsilon$, R, and B. However, due to the appearance
of $\epsilon_i$, $R_i$, and $B_i$ in multiple $F_{ij}$ terms in (10), the likelihood function may have at most N
different solutions for each $\epsilon_i$, $R_i$, and $B_i$, where N is the sequence length. It is therefore
desirable to obtain the background variables using additional data. For example, $\epsilon$ can
be estimated by comparing the MOHCA-seq data to a no-ascorbate control using the
standard formalism for obtaining reactivity probabilities in chemical mapping
experiments. Furthermore, a solution hydroxyl radical cleavage assay can help estimate
R and B. As a final note, we found, surprisingly, that the maximum likelihood estimator
of p does not depend strongly on the values of background variables $\epsilon$, R, and B: it
seems that the estimator is sensitive only to specific variable orderings (e.g. $\epsilon$ > R > B,
$\epsilon$ > R < B, etc.). An example of LAHTTE performed on a P4-P6 dataset is shown in
Supplementary Figure 1d.

Quantile normalization of RNAs probed in multiple states 

For RNAs probed in multiple states, such as the glycine and AdoCbl riboswitches, we 
found that differences in the proximity maps were best visualized using difference maps. 
To produce these maps, we first quantile-normalized COHCOA-analyzed MOHCA-seq 
datasets corresponding to the two states of interest. Quantile-normalized maps are 
shown in Figure 2a-b and d-e as well as in Supplementary Figures 11a-b and 12a-b. We
then subtracted the ligand-free data from the ligand-bound data. Lastly, we used a filter
to remove points with a signal-to-noise ratio < 1 and applied a 2D smoothing algorithm.
Supplementary Figures 11c and 12c show the resulting difference data plotted on
heatmaps with positive and negative signal cutoffs of three standard deviations.
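
The normalize-then-subtract procedure can be sketched as below. The text does not state which quantile-normalization variant was used; this example assumes the common rank-mean scheme applied to two flattened maps of equal size, with ties broken by position. The function name and the toy values are hypothetical.

```python
def quantile_normalize(a, b):
    """Give two equal-length value lists the same empirical distribution."""
    if len(a) != len(b):
        raise ValueError("datasets must have the same number of points")
    order_a = sorted(range(len(a)), key=a.__getitem__)
    order_b = sorted(range(len(b)), key=b.__getitem__)
    # Reference distribution: mean of the two datasets at each rank.
    reference = [(a[i] + b[j]) / 2.0 for i, j in zip(order_a, order_b)]
    norm_a, norm_b = [0.0] * len(a), [0.0] * len(b)
    for rank, (i, j) in enumerate(zip(order_a, order_b)):
        norm_a[i] = reference[rank]
        norm_b[j] = reference[rank]
    return norm_a, norm_b


ligand_bound = [5.0, 2.0, 3.0]   # toy values, not real data
ligand_free = [4.0, 1.0, 8.0]
norm_bound, norm_free = quantile_normalize(ligand_bound, ligand_free)
difference = [x - y for x, y in zip(norm_bound, norm_free)]
```

After normalization both maps share one value distribution, so the difference map reflects changes in where signal occurs rather than differences in overall signal scale between the two experiments.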

Mutate-and-map 

High-confidence secondary structures derived from mutate-and-map (M2) experiments
were used as inputs to computational modeling. We previously collected M2 data on the
AdoCbl riboswitch in the presence of 60 µM AdoCbl [RMDB ID RNAPZ6_1M7_0002]
and separately performed M2 on the class I ligase. The data were analyzed as
described previously (refs. 4, 25). Supplementary Figures 7 and 9 present M2 data and analysis
for the class I ligase and AdoCbl riboswitch, respectively.

Computational modeling 

De novo RNA modeling with MOHCA-seq constraints 

We used the Rosetta software (version r56277) to model all RNAs of interest in three
steps. Modeling for the AdoCbl riboswitch was performed with some differences, as
noted below. First, based on the secondary structures of the RNAs from crystal
structures or chemical mapping experiments, we pre-assembled the helix regions of the
RNAs using fragment assembly of RNA with full atom refinement (FARFAR), using the
python script helix_preassemble_setup.py. For each helical region, 100 FARFAR
models were generated, with the following sample Rosetta command line:

rna_denovo -nstruct 100 -params_file helix.params -fasta helix.fasta \
-out:file:silent helix0.out -include_neighbor_base_stacks \
-minimize_rna true -rna::corrected_geo -score:rna_torsion_potential \
RNA11_based_new -geom_sol_correct_acceptor_base \
-chemical::enlarge_H_lj -score:weights rna/rna_helix -cycles 1000 \
-output_res_num 136-142 221-227

Second, we performed low-resolution modeling using the pre-assembled helices and 
tertiary constraints derived from maxima in the 2D MOHCA-seq proximity map. Tertiary 
constraints were determined by selecting peaks from early analyzed data that were (1) 
distinguishable from the local background signal by unbiased inspection and (2) not 
attributable to secondary structure. This produced a list of pairs of residues that were 
suggested to be spatially proximal by the MOHCA-seq data (tabulated in Supplementary 
Table 2). We sorted the selected pairs for each RNA into strong and weak hits based on 
the apparent intensity of the signal relative to the background. For each pair of residues 
showing a strong MOHCA-seq hit, we constrained the distance between the O2' of the first
residue and the C4' of the second residue using a potential of the following form (see graph
in Supplementary Fig. 6):



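$$S(x) = 3x^2 - 2x^3$$

$$E(d) = \begin{cases}
4\,[1 - S(d/15)] & \text{if } 0 < d < 15 \\
4\,S(d/15 - 1) & \text{if } 15 < d < 30 \\
4 + 36\,S(d/30 - 1) & \text{if } 30 < d < 60 \\
40 & \text{if } d > 60
\end{cases}$$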



Here S is the smoothstep function, d is the distance between the atom pair, and E is the
constraint potential in Rosetta units. For residue pairs with a weak MOHCA-seq signal,
we applied a constraint potential of the same shape but weaker amplitude (1/5 of the
original potential). With the constraints and the pre-assembled helices, we performed a
fragment assembly of RNA (FARNA) simulation to generate a large number of
low-resolution Rosetta models (ranging from 10,000 to 60,000). For cases in which other
experimental constraints (hydroxyl radical footprinting for the P4-P6 domain and class I
ligase; tertiary information from mutate-and-map on the P4-P6 domain) were available,
these constraints were also included in the modeling. A sample Rosetta command line
is shown below:



rna_denovo -nstruct 500 -params_file rna_params.params -fasta rna_fasta.fasta \
-out:file:silent rna.out -include_neighbor_base_stacks \
-minimize_rna false -native rna_native.pdb \
-in:file:silent helix0.out helix1.out helix2.out helix3.out helix4.out helix5.out helix6.out \
-input_res 1-8 19-26 12-17 122-127 42-46 84-88 55-61 113-119 62-66 72-76 79-82 89-94 111-112 \
-staged_constraints -cycles 20000 -output_res_num 1-127
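
The piecewise distance potential E(d) defined above can be sketched as a plain function, which makes it easy to check that the pieces join continuously at 15, 30 and 60 Å. This is an illustrative reading of the formula, not Rosetta code; the function names and the clamping of the smoothstep argument to [0, 1] are assumptions.

```python
def smoothstep(x):
    """S(x) = 3x^2 - 2x^3, with x clamped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    return 3.0 * x * x - 2.0 * x ** 3


def mohca_restraint(d, weak=False):
    """Constraint energy (Rosetta units) for an atom-pair distance d in angstroms."""
    if d < 0:
        raise ValueError("distance must be non-negative")
    if d < 15.0:
        energy = 4.0 * (1.0 - smoothstep(d / 15.0))
    elif d < 30.0:
        energy = 4.0 * smoothstep(d / 15.0 - 1.0)
    elif d < 60.0:
        energy = 4.0 + 36.0 * smoothstep(d / 30.0 - 1.0)
    else:
        energy = 40.0
    # Weak hits use the same shape at one fifth of the amplitude.
    return energy / 5.0 if weak else energy


for d in (0.0, 7.5, 15.0, 30.0, 45.0, 60.0):
    print(d, mohca_restraint(d), mohca_restraint(d, weak=True))
```

The potential has its minimum at 15 Å, penalizes mildly up to 30 Å, and rises steeply to a plateau of 40 units beyond 60 Å, so badly violated restraints do not dominate the score without bound.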

In the final step, we refined models from the initial run that had low Rosetta low- 
resolution scores (within the lowest 1/6 of the models) using the high-resolution Rosetta 
score to obtain final minimized models. This was achieved using the rna_minimize 
Rosetta application: 

rna_minimize -native rna_native.pdb -cst_fa_file constraint \
-params_file rna_params.params -skip_coord_constraints -in:file:silent \
0.silent -out:file:silent rna_min.out
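
Choosing which low-resolution models to refine amounts to sorting by score and keeping a fixed fraction. The helper below is a hypothetical sketch of that selection, written with integer arithmetic so that fractions such as 1/6 are applied exactly; it is not part of Rosetta.

```python
def lowest_fraction(scores, numerator, denominator):
    """Indices of the lowest-scoring numerator/denominator of models (at least one)."""
    keep = max(1, len(scores) * numerator // denominator)
    return sorted(range(len(scores)), key=scores.__getitem__)[:keep]


# Toy low-resolution scores for 12 models (values are made up for illustration).
scores = [3.0, -1.0, 2.5, 0.0, 7.0, -4.0, 1.0, 9.0, 8.0, 6.0, 5.0, 4.0]
to_refine = lowest_fraction(scores, 1, 6)
```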

To find representative models, we clustered the lowest-energy models (0.5% of the total
number of the unrefined models), with a threshold (based on all-heavy-atom RMSD)
chosen so as to give 1/6 of the clustered models in the most populous cluster, as in
prior work (ref. 10). The clustering was achieved by first finding the model (the 'cluster center')
with the largest number of neighboring models within the RMSD threshold, then
assigning these models to the first cluster (cluster 0). This process was then repeated to
cluster all of the remaining models. Clustering was performed using Rosetta, with the
following command line:

cluster -in:file:silent silent.out -in:file:fullatom \
-out:file:silent_struct_type binary_rna -export_only_low false \
-out:file:silent cluster.out -cluster:radius 7
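
The clustering procedure described above is a greedy neighbor-count scheme, which can be sketched independently of Rosetta. The code below assumes a precomputed symmetric pairwise RMSD matrix and breaks ties in favor of the lowest model index; both choices, and the function name, are assumptions made for illustration.

```python
def greedy_cluster(rmsd, radius):
    """Cluster model indices from a symmetric pairwise RMSD matrix.

    Repeatedly picks the unassigned model with the most unassigned
    neighbors within `radius` as the cluster center, assigns those
    neighbors to the cluster, and continues with the remaining models.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        def neighbors(center):
            return [m for m in remaining if rmsd[center][m] <= radius]

        center = max(sorted(remaining), key=lambda c: len(neighbors(c)))
        members = sorted(neighbors(center))
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Toy example: six "models" on a line, with distance standing in for RMSD.
positions = [0.0, 1.0, 2.0, 10.0, 11.0, 30.0]
rmsd = [[abs(a - b) for b in positions] for a in positions]
clusters = greedy_cluster(rmsd, radius=1.5)
```

The first tuple returned corresponds to cluster 0, the most populous cluster, whose center serves as the representative model.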

The RMSD threshold described above was used as an estimate of the precision of the
structure ensemble from the modeling. The accuracy of the ensemble was estimated
using the median of the RMSD between each of the models in the most populous
cluster and the gold-standard crystal structure. The accuracy and precision of
modeling for the AdoCbl riboswitch were determined slightly differently, as described
below, to allow direct comparison to prior modeling efforts in the RNA-puzzle challenge.
In addition, we calculated the percentage of strong and weak MOHCA-seq constraints
satisfied by our models (averaged over all models in each cluster) and the crystal
structure. We defined a constraint as 'satisfied' if the O2' of the first residue was within
30 Å of the C4' of the second residue. The MOHCA-seq-guided models satisfied 80% or
more of the strong constraints and 60% or more of the weak constraints. As expected,
the models generally satisfied a greater percentage of constraints than the crystal
structures; however, the percent of constraints satisfied was not correlated with the
accuracy or precision of the models. The details of the modeling results and the
statistics computed as discussed above are available in Supplementary Table 3.
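
The constraint-satisfaction statistic can be computed directly from atomic coordinates. The sketch below assumes the O2' and C4' coordinates for each constrained residue pair have already been extracted from a model, and it treats a distance of exactly 30 Å as satisfied; the function name and the toy coordinates are hypothetical.

```python
import math


def fraction_satisfied(atom_pairs, cutoff=30.0):
    """Fraction of (O2' xyz, C4' xyz) pairs whose distance is within cutoff (angstroms)."""
    if not atom_pairs:
        return 0.0
    hits = sum(1 for o2_prime, c4_prime in atom_pairs
               if math.dist(o2_prime, c4_prime) <= cutoff)
    return hits / len(atom_pairs)


# Toy coordinates giving distances of 5, 31, 30 and 0 angstroms.
atom_pairs = [
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 31.0)),
    ((0.0, 0.0, 0.0), (30.0, 0.0, 0.0)),
    ((1.0, 1.0, 1.0), (1.0, 1.0, 1.0)),
]
satisfied = fraction_satisfied(atom_pairs)
```

Averaging this fraction over all models in a cluster gives the per-cluster percentage reported for the strong and weak constraint sets.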

For the AdoCbl riboswitch, we did not pre-assemble helices and instead used prebuilt 
RNA fragments that we previously generated during the RNA-puzzles challenge. These 
fragments consisted of three models each of the P1 through P6, P7 through P8, and 
P10 through P11 regions. We modeled the AdoCbl riboswitch with three distinct 
combinations of these fragments, each containing one fragment of each region. To 
estimate the precision of modeling for the AdoCbl riboswitch in the ligand-bound and 
ligand-free states, we calculated the average pairwise all-heavy-atom RMSD between 
the cluster centers of the largest clusters for the three modeling setups. The accuracy of 
the modeling for each ligand-binding state was determined by the median of the 
accuracies for the three modeling setups. The prior models generated for the 
RNA-puzzle 6 challenge can be viewed on the RNA-puzzles website 
(http://paradise-ibmc.u-strasbg.fr/rnapuzzles/index.html). 
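
The AdoCbl precision and accuracy estimates described above can be sketched as follows. This is a minimal illustration under stated assumptions: each cluster center is given as an N x 3 array of heavy-atom coordinates with atoms in matching order, and structures are optimally superimposed (Kabsch algorithm) before the RMSD is computed. The function names are hypothetical, and the superposition step is an assumption about how the RMSD was evaluated.

```python
import numpy as np
from itertools import combinations

def kabsch_rmsd(P, Q):
    """RMSD between two N x 3 coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    P = P - P.mean(axis=0)                       # remove translation
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)            # SVD of the covariance matrix
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # proper rotation (no reflection)
    diff = P @ R.T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def mean_pairwise_rmsd(centers):
    """Precision estimate: mean pairwise RMSD between cluster centers."""
    return float(np.mean([kabsch_rmsd(a, b) for a, b in combinations(centers, 2)]))

def median_accuracy(per_setup_accuracy):
    """Accuracy estimate: median of the per-setup accuracies."""
    return float(np.median(per_setup_accuracy))
```

For three modeling setups, `mean_pairwise_rmsd` averages three pairwise values and `median_accuracy` returns the middle of the three per-setup accuracies.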

Supplementary Figure Legends 
Supplementary Figure 1. MOHCA-seq data analysis. 

(a) Raw counts for example P4-P6 MOHCA-seq data. Following paired-end sequencing, 
the MAPseeker software is used to align the reads to the sequence of the RNA that was 
probed. The counts are recorded in an RDAT file, which is analyzed by Closure-based 
•OH Correlation Analysis (COHCOA). (b) COHCOA analysis after 40 iterations on 
example P4-P6 data. A full description of the COHCOA analysis can be found in the 
Methods section. (c) Final analyzed example P4-P6 proximity map. A filter is applied to 
remove points with signal-to-noise ratio < 1, and a 2D smoothing algorithm is applied. 
(d) LAHTTE analysis of the same example P4-P6 MOHCA-seq data. A full description 
of the LAHTTE analysis can be found in the Methods section. 

Supplementary Figure 2. Full MOHCA-seq proximity maps, including 5' and 3' 
buffer regions. 

(a-f) Complete MOHCA-seq proximity maps, including 5' and 3' buffer regions, which 
are excluded from the proximity maps shown in the main text. All RNAs except for P4- 
P6 include 5' and 3' reference hairpins used for normalizing chemical mapping data 
from other techniques (TH Mann, WK, RD, personal communication) (Supplementary 
Table 1). 

Supplementary Figure 3. Variation of radical source incorporation rate. 

MOHCA-seq data for P4-P6. Ratio of concentrations of 2'-NH2-2'-dATP to ATP in the 
transcription reaction: (a) 0; (b) 0.2; (c) 0.5 (standard); (d) 1.25. All fragmentation 
reactions were performed for 10 minutes. 

Supplementary Figure 4. Alternative nucleotide attachment sites for the radical 
source. 

MOHCA-seq data for P4-P6. Modified nucleotide triphosphate included in the transcription 
reaction at a molar ratio of 0.5 to the standard NTP: (a) 2'-NH2-2'-dATP; (b) 2'-NH2-2'-dUTP; 
(c) 2'-NH2-2'-dGTP; (d) 2'-NH2-2'-dCTP. All fragmentation reactions were performed 
for 30 minutes. All data were collected in one Illumina MiSeq run using a 50-cycle MiSeq 
Reagent Kit v2. 

Supplementary Figure 5. Variation of fragmentation reaction time. 

MOHCA-seq data for P4-P6. Fragmentation reaction time: (a) 0 min (no ascorbate 
added); (b) 5 min; (c) 12.5 min; (d) 30 min. All fragmentation reactions used a 
2'-NH2-2'-dATP:ATP ratio of 0.5. The P4-P6 construct used for this experiment included 
flanking 5' and 3' reference hairpins used for standardizing chemical mapping 
data (TH Mann, WK, RD, personal communication) (Supplementary Table 1). All data 
were collected in one Illumina MiSeq run using a 50-cycle MiSeq Reagent Kit v2. 

Supplementary Figure 6. Rosetta scoring potential used to incorporate MOHCA-seq 
constraints in modeling. 

Shown is the scoring potential for strong MOHCA-seq hits, which was generated using 
the smoothstep function as described in the Methods. The scoring potential for weak 
MOHCA-seq hits was the same except with 5-fold lower amplitude. 
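
To make the shape of such a potential concrete, the sketch below builds a penalty from the standard cubic smoothstep function, 3x^2 - 2x^3 on [0, 1]. Everything numeric here is a placeholder assumption: the distances `d_lo` and `d_hi`, the unit amplitude, and the choice that the penalty rises with distance are illustrative and are not taken from the Methods. Only the 5-fold lower amplitude for weak hits comes from the legend.

```python
def smoothstep(x):
    """Cubic smoothstep: 0 for x <= 0, 1 for x >= 1, smooth in between."""
    x = min(max(x, 0.0), 1.0)
    return x * x * (3.0 - 2.0 * x)

def mohca_penalty(distance, strong=True, d_lo=25.0, d_hi=35.0, amplitude=1.0):
    """Illustrative distance penalty for one MOHCA-seq hit.

    d_lo, d_hi and amplitude are hypothetical placeholders, not published values.
    Weak hits use the same shape with 5-fold lower amplitude.
    """
    scale = amplitude if strong else amplitude / 5.0
    return scale * smoothstep((distance - d_lo) / (d_hi - d_lo))
```

In this sketch a pair closer than `d_lo` incurs no penalty, and a pair farther than `d_hi` incurs the full amplitude.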

Supplementary Figure 7. Mutate-and-map (M2) analysis of class I ligase using 
1M7 modifier. 

(a) Mutate-and-map dataset for 1M7 modification across 120 single mutations along the 
class I ligase sequence. (b) Z-score contact map extracted from (a). (c) Secondary 
structure prediction and (d) bootstrap support matrix using M2 data. In (b) and (d), the 
crystallographic secondary structure is overlaid as cyan and green circles, with an 
alternative P3 helix predicted by M2 data overlaid as red circles. When SHAPEknots 26 
was used to predict the pseudoknots in the full sequence, only the P2 pseudoknot was 
recovered. However, SHAPEknots successfully predicted the P3 helix when the P1 helix 
was omitted. 

Supplementary Figure 8. Comparison of class I ligase crystal structure and 
knotted and unknotted cluster centers. 

(a) Crystal structure of the class I ligase. (b) Cluster center of a knotted cluster of 
MOHCA-seq-guided models; the 3'-end of the RNA is knotted (black arrow). (c) Cluster 
center of an unknotted cluster of models. 

Supplementary Figure 9. Mutate-and-map analysis of the adenosylcobalamin 
riboswitch, acquired during the sixth RNA-puzzles structure prediction trials. 

(a) Mutate-and-map dataset for 1M7 modification across 168 single mutations along the 
AdoCbl riboswitch sequence in the presence of 60 µM AdoCbl ligand. Mutants showing 
poor data quality are marked by red bars. (b) Z-score contact map extracted from (a). 
(c) Secondary structure prediction and (d) bootstrap support matrix using M2 data. In (b) 
and (d), the crystallographic secondary structure is overlaid as cyan circles. 

Supplementary Figure 10. Comparison between MOHCA-seq-guided models of 
the adenosylcobalamin riboswitch using different initial RNA fragment sets. 

(a) Crystal structure of S. thermophilum AdoCbl riboswitch (PDB ID 4GXY) and 
MOHCA-seq models generated using the three distinct sets of prebuilt RNA fragments 
(labeled setups 1-3) in the presence of (b) 140 µM AdoCbl ligand or (c) 0 µM AdoCbl 
ligand. Models shown include the cluster center (opaque) and four other models from the 
largest cluster. 

Supplementary Figure 11. Difference map comparison of the ligand-bound and 
ligand-free states of the glycine riboswitch. 

MOHCA-seq data for glycine riboswitch with (a) 10 mM glycine and (b) 0 mM glycine. 
(c) Difference map between the ligand-bound and ligand-free MOHCA-seq data. 
MOHCA-seq hits enhanced in the ligand-bound state are yellow (positive) and hits 
enhanced in the ligand-free state are cyan (negative). 

Supplementary Figure 12. Difference map comparison of the ligand-bound and 
ligand-free states of the adenosylcobalamin riboswitch. 

MOHCA-seq data for AdoCbl riboswitch with (a) 140 µM AdoCbl and (b) 0 µM AdoCbl. 
(c) Difference map between the ligand-bound and ligand-free MOHCA-seq data. 
MOHCA-seq hits enhanced in the ligand-bound state are yellow (positive) and hits 
enhanced in the ligand-free state are cyan (negative). 

Supplementary Table 1. Sequences of RNAs, single-stranded DNA ligation 
adapters, and sequencing primers. 

Supplementary Table 2. Pairwise MOHCA-seq constraints used for de novo 
modeling. 

Supplementary Table 3. Results of de novo modeling incorporating MOHCA-seq 
constraints. 

Supplementary Figures 1-12 

Supplementary Figure 1. MOHCA-seq data analysis 
Supplementary Figure 2. Full MOHCA-seq proximity maps, including 5' and 3' buffer regions 
Supplementary Figure 3. Variation of radical source incorporation rate 
Supplementary Figure 4. Alternative nucleotide attachment sites for the radical source 
Supplementary Figure 5. Variation of fragmentation reaction time 
Supplementary Figure 6. Rosetta scoring potential used to incorporate MOHCA-seq constraints in modeling 
Supplementary Figure 8. Comparison of class I ligase crystal structure and knotted and unknotted cluster centers 
Supplementary Figure 9. Mutate-and-map analysis of the adenosylcobalamin riboswitch, the sixth target of the RNA-puzzles structure prediction trials 
Supplementary Figure 10. Comparison between MOHCA-seq-guided modeling of the adenosylcobalamin riboswitch using different initial RNA fragment sets 

Supplementary Tables 1-3 



Supplementary Table 1. Sequences of RNAs, single-stranded DNA ligation adapters, and sequencing primers 

RNAs were transcribed in vitro from PCR-assembled DNA templates which included a promoter sequence (TTCTAATACGACTCACTATA) on the 5' end 
for T7 RNA polymerase. Red = 5'-buffer region; black = region of interest; blue = 3'-buffer and tail region. 'RTU' sequences are reverse transcription 
primers that anneal to the universal miRNA cloning linker. Purple = 12-nt sequence-balanced barcode. 



Name 


Sequence 


Tetrahymena ribozyme 
P4-P6 


GGCCAAAACAACGGAAUUGCGGGAAAGGGGUCAACAGCCGUUCAGUACCAAGUCUCAGGGGAAACUUUGAGAUGGCCUUGCAAA 
GGGUAUGGUAAUAAGCUGACGGACAUGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAGAUCUUCUGUUGAUAUGGAUGCAG 
UUCAAAACCAAACCAAAGAAACAACAACAACAAC 


Tetrahymena ribozyme 
P4-P6 with flanking 
hairpins 


GGCCAAAGGCGUCGAGUAGACGCCAACAACGGAAUUGCGGGAAAGGGGUCAACAGCCGUUCAGUACCAAGUCUCAGGGGAAACU 
UUGAGAUGGCCUUGCAAAGGGUAUGGUAAUAAGCUGACGGACAUGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAGAUCUU 
CUGUUGAUAUGGAUGCAGUUCAAAACCAAACCGUCAGCGAGUAGCUGACAAAAAGAAACAACAACAACAAC 


F. nucleatum double 
glycine riboswitch 


AAAGGACUCAUAUUGGACGAACCUCUGGAGAGCUUAUCUAAGAGAUAACACCGAAGGAGCAAAGCUAAUUUUAGCCUAAACUCUC 
AGGUAAAAGGACGGAGAAAACACAAGUUCAGGAGUACUGAACCAAAGAAACAACAACAACAAC 


S. thermophilum 
adenosylcobalamin 
riboswitch 


GGAACAGCCCGAGUAGGGCCGGCAGGUGCUCCCGACCCUGCGGUCGGGAGUUAAAAGGGAAGCCGGUGCAAGUCCGGCACGGU 

CCCGCCACUGUGACGGGGAGUCGCCCCUCGGGAUGUGCCACUGGCCCGAAGGCCGGGAAGGCGGAGGGGCGGCGAGGAUCCG 

GAGUCAGGAAACCUGCCUGCCGGCGCGAGUAGCGCAAACGAAAGAAACAACAACAACAAC 


Class I ligase 


A A/^/^O/^OA/^l 1 A^/^f , r v A A Al If^f^A^I 1 Af"/^ A A A f~*l IAI 1 A f" 1 1 1 A f" 1 ! A I IA Al 1 A A A A A A Al 1^1 IPP^PO A A /""• f" 1 1 II 1 A /"» A AP A 1 1 

CGAAACACGAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGAAAAGCGCGAGUAGCGCAACA 
AAGAAACAACAACAACAAC 


Universal miRNA cloning 
linker (NEB) 




Second ligation adapter 


pAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGp 


RTU048 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTC" ATTGATGGTGCCTACAG 


RTU049 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTT/ ATTGATGGTGCCTACAG 


RTU050 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTG( iGATTGATGGTGCCTACAG 


RTU051 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAC DTATTG ATG GTG CCTACAG 


RTU101 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTT1 iGATTGATGGTGCCTACAG 


RTU102 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAC ATTGATGGTGCCTACAG 


RTU103 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCC DTATTGATGGTGCCTACAG 


RTU104 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTG/ rAATTG ATG GTG CCTACAG 


RTU105 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTC( ATTGATGGTGCCTACAG 


RTU106 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTT1 ^ATTGATGGTGCCTACAG 


RTU107 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTG/ ATTGATGGTGCCTACAG 


RTU108 


AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAC iGATTGATGGTGCCTACAG 

Supplementary Table 2. Pairwise MOHCA-seq constraints used for de novo modeling 

Tetrahymena ribozyme P4-P6      F. nucleatum double glycine     F. nucleatum double glycine
                                riboswitch (10 mM glycine)      riboswitch (0 mM glycine)
        5'-end  3'-end                  5'-end  3'-end                  5'-end  3'-end
Strong  126     188             Strong  2       38              Weak    2       38
        123     194                     1       44                      11      26
        132     178                     5       60                      2       64
        130     166                     2       64                      5       57
        132     162                     25      54                      25      57
        132     159                     45      64                      45      64
        137     156                     45      75                      78      113
        136     162                     32      88                      86      108
        154     178                     42      84                      136     154
        171     211                     48      84                      100     148
        170     198                     55      88                      105     154
        170     190                     55      108                     74      157
        170     178                     58      118
        185     196                     67      119
        185     212                     67      121
        189     211                     78      113
        189     198                     78      135
        172     221                     42      157
        239     247                     74      156
        208     259                     100     148
        209     254                     100     145
        215     248                     113     153
        221     244                     135     154
        227     238                     5       119
        221     232             Weak    25      88
        221     228                     37      62
Weak    189     224                     79      103
        202     214                     15      88
        114     168                     32      108
        189     224                     9       138
        202     214                     25      118
        151     225
        108     254
        154     254
        189     255

S. thermophilum                 S. thermophilum                 Class I ligase
adenosylcobalamin riboswitch    adenosylcobalamin riboswitch
(140 µM adenosylcobalamin)      (0 µM adenosylcobalamin)
        5'-end  3'-end                  5'-end  3'-end                  5'-end  3'-end
Strong  11      26              Strong  12      25              Strong  6       41
        26      44                      9       26                      10      40
        5       42                      26      42                      31      84
        26      75                      26      76                      42      109
        35      75                      42      56                      42      116
        53      77                      52      77                      54      79
        35      89                      76      88                      63      88
        24      89                      75      103                     68      89
        52      89                      85      133                     67      94
        52      64                      103     121                     63      83
        75      88                      35      167                     70      103
        74      103                     53      153                     64      103
        73      117                     66      154                     64      109
        90      100                     67      162                     78      109
        85      132                     74      147                     80      103
        53      154                     79      144             Weak    5       49
        67      153                     86      133                     5       110
        67      162                     150     161                     27      48
        73      147             Weak    6       41                      35      70
        79      142                     35      88                      47      87
        102     147                     75      117                     47      103
        102     157                     75      132                     47      70
        118     147                     90      107                     76      113
        128     158                     6       162                     76      116
        146     158                     43      154
Weak    14      90                      46      162
        35      102                     67      158
        59      76                      104     147
        56      64                      104     157
        102     121                     127     158
        54      126                     127     147
        52      144                     118     147
        65      147                     158     158
        5       162
        43      154
        35      166
        118     157
        127     147

Supplementary Table 3. Results of de novo modeling incorporating MOHCA-seq constraints 

                                # of     Size of   RMSD,       RMSD,       Clustering   Total        Percent         Percent      Percent      Percent
                                models   largest   cluster     cluster     threshold    constraints  strong          strong       weak         weak
                                         cluster   center (Å)  median (Å)  (Å)                       constraints     constraints  constraints  constraints
                                                               (accuracy)  (precision)               satisfied [a],  satisfied,   satisfied,   satisfied,
                                                                                                     models          crystal [b]  models       crystal [b]

Tetrahymena ribozyme P4-P6      61,115   48        8.6         12.3        11.7         35           79.3            69.2         51.4         44.4

F. nucleatum double glycine     16,506   14        7.9         10.3        7.1          31           93.5            100          70.4         42.9
  riboswitch, bound

F. nucleatum double glycine     15,771   12        25.4        17.7        18.9         12           N/A [c]         N/A [c]      64.6         83.3
  riboswitch, unbound [b]

Class I ligase (unknotted)      17,881   7         14.5        15.4        8.2          24           81.9            53.3         44.4         77.8

Class I ligase (knotted)        17,881   14        16.1        15.5        8.2          24           85.2            53.3         61.9         77.8

S. thermophilum                 14,219   12        12.1        12.4        9.8          38           84.7            76.0         85.3         69.2
  adenosylcobalamin riboswitch,
  bound [d]

S. thermophilum                 11,980   10        17.3        15.6        13.9         33           82.8            94.4         82.7         73.3
  adenosylcobalamin riboswitch,
  unbound [b,d]

[a] Constraints were considered satisfied if the O2' of the 5'-residue was less than 30 Å from the C4' of the 3'-residue (Supplementary Table 2). 
[b] For ligand-free states of the glycine riboswitch and adenosylcobalamin riboswitch, RMSDs and percent constraints satisfied were calculated for 
gold standard crystal structures solved in the presence of ligand. 
[c] No strong constraints were selected for the unbound state of the glycine riboswitch. 
[d] (See description of AdoCbl riboswitch modeling in Methods.) Three separate modeling runs were performed for each ligand-binding state of the 
AdoCbl riboswitch. The number of models, size of largest cluster, cluster center RMSD, median Rosetta energy score, and percent constraints 
satisfied statistics for the AdoCbl riboswitch are representative data from the modeling runs with the most models generated. The cluster median 
(accuracy) shown is the median of the three cluster median RMSDs of the largest clusters generated in each of the three modeling runs. The 
clustering threshold (precision) shown is calculated as the mean pairwise RMSD between the cluster centers of the largest clusters generated in 
each of the three modeling runs. 